Space Efficient Linear Time Lempel-Ziv Factorization on Constant Size Alphabets
Abstract
We present a new algorithm for computing the Lempel-Ziv factorization (LZ77) of a given string of length N in linear time that uses only N log N + O(1) bits of working space, i.e., a single integer array, for constant size integer alphabets. This greatly improves on the previous best space requirement for linear time LZ77 factorization (Kärkkäinen et al., CPM 2013), which requires two integer arrays of length N. Computational experiments show that, despite its added complexity, the algorithm is only about twice as slow as the previous fastest linear time algorithms.
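For readers unfamiliar with the object being computed, the following is a minimal sketch of the LZ77 factorization itself. It is a naive quadratic-time illustration, not the linear time, single-array algorithm of the paper. The function name `lz77_factorize` and the output convention are assumptions made for this example: a fresh character is reported as `(char, 0)`, and a repeated factor as `(source_position, length)`.

```python
def lz77_factorize(s):
    """Naive LZ77 factorization (illustrative only, O(n^2) time or worse).

    Each factor is either a character that has not occurred before,
    reported as (char, 0), or the longest prefix of the remaining text
    that also starts at an earlier position, reported as
    (source_position, length). Source and target may overlap.
    """
    factors = []
    n = len(s)
    i = 0
    while i < n:
        best_len, best_src = 0, -1
        # Try every earlier starting position j < i as a candidate source.
        for j in range(i):
            length = 0
            while i + length < n and s[j + length] == s[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, j
        if best_len == 0:
            factors.append((s[i], 0))  # first occurrence of this character
            i += 1
        else:
            factors.append((best_src, best_len))
            i += best_len
    return factors


print(lz77_factorize("abababab"))
```

On the string `abababab` the sketch produces two single-character factors followed by one self-overlapping factor of length 6 that copies from position 0.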
Related resources
Fast Lempel-Ziv Decompression in Linear Space
We consider the problem of decompressing the Lempel-Ziv 77 representation of a string S ∈ [σ]^n using a working space as close as possible to the size z of the input. The folklore solution for the problem runs in optimal O(n) time but requires random access to the whole decompressed text. A better solution is to convert LZ77 into a grammar of size O(z log(n/z)) and then stream S in optimal linear...
Small-space encoding LCE data structure with constant-time queries
The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between suffixes of w that start at any two given positions is answered quickly. In this paper, we present a data structure of O(zτ + n/τ) words of space which answers LCE queries in O(1) time and can be built in O(n log σ) time, where 1 ≤ τ ≤ √n is a parame...
Constructing LZ78 Tries and Position Heaps in Linear Time for Large Alphabets
We present the first worst-case linear-time algorithm to compute the Lempel-Ziv 78 factorization of a given string over an integer alphabet. Our algorithm is based on nearest marked ancestor queries on the suffix tree of the given string. We also show that the same technique can be used to construct the position heap of a set of strings in worst-case linear time, when the set of strings is give...
Lempel Ziv Computation in Small Space (LZ-CISS)
For both the Lempel Ziv 77- and 78-factorization we propose algorithms generating the respective factorization using (1 + ε)n lg n + O(n) bits (for any positive constant ε ≤ 1) working space (including the space for the output) for any text of size n over an integer alphabet in O(n/ε)
Linear Time Lempel-Ziv Factorization: Simple, Fast, Small
Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in many diverse applications, including data compression, text indexing, and pattern discovery. We describe new linear time LZ factorization algorithms, some of which require only 2n log n + O(log n) bits of working space to factorize a string of length n. These are the most space efficient linear time al...
Journal title:
- CoRR
Volume abs/1310.1448, issue -
Pages -
Publication date 2013